1. Introduction



Non-logarithmic information measures have become very fashionable nowadays, with
multiple applications to different scientific disciplines (see, for instance, and references 
therein). They were introduced in the cybernetic-information communities by Havrda and
Charvát [2] in 1967 and Vajda in 1968, and rediscovered by Daróczy in 1970, with
several echoes mostly in the field of image processing: see [5] for a historical summary and the
pertinent references. In astronomy, physics, economics, biology, etc., these non-logarithmic
information measures are often used in the form of the q-entropies introduced by
Tsallis in 1988.

These entropies are maximized by power-type distributions. The properties of both discrete
and continuous power-type distributions have been carefully reviewed recently in Ref. [7]
with respect to

(i) their behavior under convolution and

(ii) their relationships with stable Lévy distributions.

In this paper, we focus more closely on further properties of these
distributions and answer some open questions raised in [7]; in this way we hope, in the
wake of Ref. [5] and related work, to contribute to a more complete understanding of the
ensuing theoretical context.



2. Definitions and Notations 

In what follows we consider probability densities $f_X$ (with $X \in \mathbb{R}^n$) that maximize a
generalized entropy, either of the Havrda-Charvát-Rényi type

$$H_q(X) = \frac{1}{1-q}\log\left(\int f_X^q(X)\,dX\right) \qquad (1)$$

or of the Tsallis type

$$S_q(X) = \frac{1}{q-1}\left(1-\int f_X^q(X)\,dX\right) \qquad (2)$$

where $q$ is a real parameter (called "nonextensivity parameter" in [1]). As $H_q$ can be expressed
as an increasing function of $S_q$, both entropies have the same maximizers. As a consequence,
all results expressed in this paper hold for both types of entropies, except in Section 6, which
deals with a special property of $S_q$. To each density $f_X$, we associate its so-called escort
distribution [11], defined as

$$F_X(X) = \frac{f_X^q(X)}{\int f_X^q(X)\,dX}.$$

Note that the dependence of $F_X$ on $q$ is not explicitly stated for notational simplicity.
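The statement that $H_q$ is an increasing function of $S_q$ can be made explicit: writing $I = \int f_X^q$, definitions (1) and (2) give $H_q = \log(1 + (1-q)S_q)/(1-q)$, whose derivative with respect to $S_q$ is $1/I > 0$. The following minimal sketch checks this identity numerically for a one-dimensional Gaussian density; the function names, the choice $q = 1.5$ and the midpoint quadrature are illustrative assumptions, not part of the original text.

```python
import math

def entropies_from_integral(I, q):
    """Return (H_q, S_q) from I = integral of f^q, following definitions (1) and (2)."""
    H = math.log(I) / (1.0 - q)
    S = (1.0 - I) / (q - 1.0)
    return H, S

def gaussian_power_integral(q, sigma=1.0, half_width=12.0, steps=20_000):
    """Midpoint-rule estimate of the integral of f^q for a centred 1-D Gaussian density."""
    h = 2.0 * half_width * sigma / steps
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    total = 0.0
    for i in range(steps):
        x = -half_width * sigma + (i + 0.5) * h
        total += (norm * math.exp(-0.5 * (x / sigma) ** 2)) ** q
    return total * h

q = 1.5                                   # illustrative value (assumption)
I = gaussian_power_integral(q)
H, S = entropies_from_integral(I, q)
# H_q recovered from S_q through the monotone map discussed above
H_from_S = math.log(1.0 + (1.0 - q) * S) / (1.0 - q)
```

For the Gaussian case the integral is also available in closed form, $I = (2\pi\sigma^2)^{(1-q)/2}/\sqrt{q}$, which gives an independent check of the quadrature.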
2.1. Power-law distributions as entropy maximizers

The following theorem generalizes to the n-variate case the characterization given in Ref.
[7, Eq. (42)] for the maximum entropy distributions with fixed q-covariance.



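Theorem 1 Under the q-covariance constraint

$$\int XX^T F_X(X)\,dX = K$$

(where the q-covariance matrix $K$ is symmetric positive definite) and the normalization
constraint $\int f_X = 1$, the power-law entropy (1) or (2) has a single maximizer equal to:

• if $1 < q < \frac{n+2}{n}$,

$$f_X(X) = A_q\left(1 + X^T\Lambda^{-1}X\right)^{\frac{1}{1-q}} \qquad (3)$$

with

$$A_q = \frac{\Gamma\left(\frac{1}{q-1}\right)}{\Gamma\left(\frac{1}{q-1}-\frac{n}{2}\right)|\pi\Lambda|^{1/2}}, \qquad \Lambda = mK, \qquad m = \frac{2}{q-1} - n;$$

• if $q < 1$,

$$f_X(X) = A_q\left(1 - X^T\Sigma^{-1}X\right)_+^{\frac{1}{1-q}} \qquad (4)$$

with

$$A_q = \frac{\Gamma\left(\frac{2-q}{1-q}+\frac{n}{2}\right)}{\Gamma\left(\frac{2-q}{1-q}\right)|\pi\Sigma|^{1/2}}, \qquad \Sigma = pK, \qquad p = \frac{2}{1-q} + n,$$

and with notation $(x)_+ = \max(0, x)$.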

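In the case $n = 1$ we recover the results of [7, Eq. (42)], namely

• if $1 < q < 3$,

$$f_X(x) = \frac{\Gamma\left(\frac{1}{q-1}\right)}{\Gamma\left(\frac{3-q}{2(q-1)}\right)}\sqrt{\frac{q-1}{\pi(3-q)K}}\left(1+\frac{q-1}{3-q}\frac{x^2}{K}\right)^{\frac{1}{1-q}} \qquad (5)$$

• if $q < 1$,

$$f_X(x) = \frac{\Gamma\left(\frac{5-3q}{2(1-q)}\right)}{\Gamma\left(\frac{2-q}{1-q}\right)}\sqrt{\frac{1-q}{\pi(3-q)K}}\left(1-\frac{1-q}{3-q}\frac{x^2}{K}\right)_+^{\frac{1}{1-q}} \qquad (6)$$

Note the existence of a minor typo in [7] for the definition of $A_q$ in the case $q > 1$ (replace
$2/(1-q)$ by $2(1-q)$). For the correct expression see also [12].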
2.2. Student-t and Student-r distributions

In statistics, distribution (3) is called an n-variate Student-t with $m$ degrees of freedom and
q-covariance matrix $K$: it will be denoted as $\mathcal{T}(m, n, K)$ in the following. We notice that its
nonextensivity parameter $q$ is linked to the dimension $n$ and the number of degrees of freedom
$m$ by

$$q = \frac{m+n+2}{m+n}.$$

Moreover, convergence of both integrals $\int f_X(X)\,dX$ and $\int XX^T F_X(X)\,dX$ requires the
same condition, namely $q < \frac{n+2}{n}$ or equivalently $m > 0$. In the next section, we will endow
parameter $m$ with a meaning.

Accordingly, distribution (4) is an n-variate Student-r with $p$ degrees of freedom and
q-covariance matrix $K$: it will be denoted as $\mathcal{R}(p, n, K)$. We remark that its nonextensivity
parameter $q$ is linked to parameter $p$ and dimension $n$ as

$$q = \frac{p-n-2}{p-n}.$$
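The two relations between the nonextensivity parameter and the degrees of freedom are easy to get wrong by a sign, so a small conversion helper may be useful. The sketch below simply encodes $q = (m+n+2)/(m+n)$ and $q = (p-n-2)/(p-n)$ together with their inverses; the function names are hypothetical.

```python
def q_from_m(m, n):
    """Nonextensivity index of a Student-t with m degrees of freedom in dimension n (q > 1)."""
    return (m + n + 2.0) / (m + n)

def m_from_q(q, n):
    """Inverse map m = 2/(q-1) - n, meaningful for 1 < q < (n+2)/n."""
    return 2.0 / (q - 1.0) - n

def q_from_p(p, n):
    """Nonextensivity index of a Student-r with p degrees of freedom in dimension n (q < 1)."""
    return (p - n - 2.0) / (p - n)

def p_from_q(q, n):
    """Inverse map p = 2/(1-q) + n, meaningful for q < 1."""
    return 2.0 / (1.0 - q) + n
```

For instance, a scalar Student-t with $m = 3$ corresponds to $q = 3/2$, and a scalar Student-r with $p = 5$ corresponds to $q = 1/2$.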

2.3. Stochastic representations 

Beck and Cohen have recently introduced in the literature an interesting statistical
concept, named superstatistics, that links different types of probability
densities. In this vein, our distributions above can be shown to correspond to multivariate
Gaussian densities whose covariance matrix fluctuates according to a certain law, as detailed
in the two following theorems.

Theorem 2 If $X$ follows a $\mathcal{T}(m, n, K)$ distribution then a stochastic representation of $X$
writes‡

$$X \sim \frac{\Lambda^{1/2}G}{a} \qquad (7)$$

where $G$ is an n-variate Gaussian vector with unit covariance matrix, $a$ is a random
variable independent of $G$ that follows a $\chi$ distribution§ with number of degrees of freedom
$m = \frac{2}{q-1} - n$, and $\Lambda = mK$.

A remarkable fact deserves emphasis here: this approach can be extended to the case
$q < 1$, with a noticeable difference. This extension is based on the following duality
result.

Theorem 3 If $X \sim \mathcal{T}(m, n, K)$ and $\Lambda = mK$ then the random vector $Y$ defined as

$$Y = \frac{X}{\sqrt{1 + X^T\Lambda^{-1}X}} \qquad (8)$$

is such that $Y \sim \mathcal{R}(p, n, K)$ with
$p = m + n - 2$.

If $q$ and $q'$ denote the respective nonextensivity indices of $X$ and $Y$, then

$$\frac{1}{1-q'} = \frac{1}{q-1} - \frac{n}{2} - 1. \qquad (9)$$

‡ in the following, sign $\sim$ means "is distributed as"

§ a chi distribution with $m$ degrees of freedom has density $f_a(a) = \frac{2^{1-m/2}}{\Gamma(m/2)}a^{m-1}\exp(-a^2/2)$ for $a \ge 0$; chi distributions are usually restricted to integer degrees of
freedom. If $m \notin \mathbb{N}$ then the $\chi$ distribution should be extended to the distribution of the square root of a gamma
random variable with shape parameter $m/2$ and scale parameter 2. For the sake of simplicity, we will speak of $\chi$ distribution in
this case too.



In Figure 1 below, values of $q'$ as a function of $q$ are plotted for $n = 1, 2, 5$ and 10 (right to left).
We remark that transformation (8) induces a one-to-one relationship between $q \in [1, \frac{n+4}{n+2}[$ and
$q' \in ]-\infty, 1]$ and has the Gaussian distribution ($q = q' = 1$) as fixed point.

Figure 1. $q'$ as a function of $q$ as in (9) for $n = 1, 2, 5$ and 10 (right to left)
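The curves of Figure 1 can be regenerated from relation (9), read here as $1/(1-q') = 1/(q-1) - n/2 - 1$. The sketch below evaluates $q'$ in two ways, directly from (9) and through the degrees of freedom ($m = 2/(q-1) - n$, $p = m + n - 2$, $q' = (p-n-2)/(p-n)$), as a consistency check; the function names are hypothetical.

```python
def dual_index(q, n):
    """q' of the dual vector Y, from 1/(1-q') = 1/(q-1) - n/2 - 1."""
    inv = 1.0 / (q - 1.0) - n / 2.0 - 1.0
    if inv <= 0.0:
        # corresponds to m = 2/(q-1) - n <= 2, where no Student-r dual with q' < 1 exists
        raise ValueError("q is too large for this dimension")
    return 1.0 - 1.0 / inv

def dual_index_via_dof(q, n):
    """Same quantity through m = 2/(q-1) - n, p = m + n - 2, q' = (p-n-2)/(p-n)."""
    m = 2.0 / (q - 1.0) - n
    p = m + n - 2.0
    return (p - n - 2.0) / (p - n)
```

With $q = 1.2$ and $n = 1$ both routes give $q' = 5/7$; as $q$ decreases towards 1, $q'$ increases towards 1, in agreement with the Gaussian fixed point.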

An important consequence of the above is the following dual result of Theorem 2.

Theorem 4 If $Y$ follows a $\mathcal{R}(p, n, K)$ distribution then a stochastic representation of $Y$
writes

$$Y \sim \frac{\Sigma^{1/2}G}{\sqrt{G^TG + b^2}} \qquad (10)$$

where $G$ is an n-variate Gaussian vector with unit covariance matrix, where $\Sigma =
(p - n + 2)K$ and $b$ is a random variable independent of $G$ that follows a $\chi$ distribution
with $p - n + 2$ degrees of freedom.

Here, the important difference, as compared to the case $q > 1$, is that
the fluctuations, represented by the denominator of (10), now depend on the values of
the Gaussian system through the presence of the term $G^TG$.



2.4. Covariance matrices 

The covariance matrices $R = EXX^T$ of both distributions are related to their q-covariance
matrices as follows.

Theorem 5 Distribution $\mathcal{T}(m, n, K)$ has covariance matrix

$$R = \frac{m}{m-2}K \qquad (11)$$

provided $m > 2$, that is $q < \frac{n+4}{n+2}$. For example, a finite covariance matrix exists in the case
$n = 1$ only if $1 < q < \frac{5}{3}$.

Proof

Using the stochastic representation (7), we deduce

$$EXX^T = \Lambda^{1/2}\left(EGG^T\right)\Lambda^{1/2}\,E\frac{1}{a^2},$$

with $Ea^{-2} = \frac{1}{m-2}$ and $\Lambda^{1/2}\left(EGG^T\right)\Lambda^{1/2} = \Lambda = mK$. □
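Theorem 5 lends itself to a quick Monte Carlo check through the Gaussian-over-chi representation. The sketch below treats the scalar case $n = 1$, draws $X = \sqrt{mK}\,G/a$ with $a^2$ gamma distributed (shape $m/2$, scale 2), and compares the empirical second moment with $mK/(m-2)$. The parameter values, the seed and the sample size are arbitrary assumptions of this illustration.

```python
import math
import random

def sample_student_t(m, K, rng):
    """One draw of a scalar Student-t variable through X = sqrt(m*K) * G / a."""
    g = rng.gauss(0.0, 1.0)
    # a is chi with m degrees of freedom: square root of a Gamma(shape m/2, scale 2) variable
    a = math.sqrt(rng.gammavariate(m / 2.0, 2.0))
    return math.sqrt(m * K) * g / a

rng = random.Random(0)
m, K, N = 10.0, 2.0, 100_000             # illustrative values (assumption)
second_moment = sum(sample_student_t(m, K, rng) ** 2 for _ in range(N)) / N
predicted = m / (m - 2.0) * K            # covariance predicted by Theorem 5
```

The empirical value fluctuates around the prediction with a relative error of the order of one percent for this sample size; smaller $m$ (closer to 2) gives heavier tails and slower convergence.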

Theorem 6 Distribution $\mathcal{R}(p, n, K)$ has covariance matrix

$$R = \frac{p-n+2}{p+2}K. \qquad (12)$$

Proof

The proof uses the polar factorization property [13] of stochastic representation (10), namely
the fact that $G/\sqrt{G^TG + b^2}$ and $\sqrt{G^TG + b^2}$ are independent. As a consequence

$$EYY^T = \frac{\Sigma^{1/2}\left(EGG^T\right)\Sigma^{1/2}}{E\left(G^TG + b^2\right)} = \frac{p-n+2}{p+2}K.$$
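The same kind of numerical check applies to the Student-r case. The sketch below samples $Y = \Sigma^{1/2}G/\sqrt{G^TG + b^2}$ with $\Sigma = (p-n+2)K$ and $b$ chi distributed with $p-n+2$ degrees of freedom, exactly as the representation is stated in Theorem 4, and estimates $EYY^T$ for $n = 2$. A diagonal $K$ is assumed only to avoid a matrix square root; all numerical values are illustrative.

```python
import math
import random

def sample_student_r(p, K_diag, rng):
    """Draw Y = Sigma^{1/2} G / sqrt(G'G + b^2) for a diagonal K (assumption of this sketch)."""
    n = len(K_diag)
    dof = p - n + 2                        # degrees of freedom of b, as in Theorem 4
    g = [rng.gauss(0.0, 1.0) for _ in range(n)]
    b2 = rng.gammavariate(dof / 2.0, 2.0)  # b^2 is chi-square with `dof` degrees of freedom
    r = math.sqrt(sum(gi * gi for gi in g) + b2)
    return [math.sqrt(dof * k) * gi / r for k, gi in zip(K_diag, g)]

rng = random.Random(1)
p, K_diag, N = 6.0, [1.0, 3.0], 100_000
acc = [0.0, 0.0, 0.0]                      # running sums of Y1^2, Y2^2 and Y1*Y2
for _ in range(N):
    y1, y2 = sample_student_r(p, K_diag, rng)
    acc[0] += y1 * y1
    acc[1] += y2 * y2
    acc[2] += y1 * y2
emp = [a / N for a in acc]
factor = (p - len(K_diag) + 2) / (p + 2)   # ratio (p-n+2)/(p+2) of Theorem 6
```

Since $Y$ is bounded, the estimates converge quickly: the diagonal entries approach `factor * K_diag[i]` and the cross term approaches zero.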

We note that in the Gaussian case ($p \to +\infty$ in (12) or $m \to +\infty$ in (11)), the q-covariance
and the covariance matrices coincide.



2.5. Geometric characterization 

Geometric characterizations of both distributions (3) and (4) in terms of projections of the
uniform distribution on a sphere have been detailed in the literature. According to the stochastic
representation (10), $\Sigma^{-1/2}Y$ can be interpreted, if $p \in \mathbb{N}$, as the marginal vector of a
$(p+2)$-variate random vector uniformly distributed on the sphere in $\mathbb{R}^{p+2}$. A link between
this observation and the role of extended information measures in the microcanonical
framework has also been discussed in the literature.



3. The stability issue 

As noted in [7], distributions (3) and (4) are not stable under convolution since they do not belong
to the Lévy class: the sum of two independent random variables following either distribution
(3) or distribution (4) does not follow any of these distributions again, as opposed to the
Maxwellian-Gaussian case. It is then suggested in [7] that, in order to recover the original
distribution after summation, a certain kind of dependence should be introduced between the
components of the sum.

The aim of the next paragraph is to show that such dependence can be accurately
characterized in the case of power-law distributions.
3.1. A first example: case q > 1

Let us assume, for instance, that $q > 1$ and choose $X$ to be a random vector of dimension
$n$ distributed according to (3). We extract from it two scalar components, say $X_1$ and $X_2$;
according to (7), these two components can be expressed as

$$X_1 \sim \frac{(\Lambda^{1/2}G)_1}{a}, \qquad X_2 \sim \frac{(\Lambda^{1/2}G)_2}{a} \qquad (13)$$

where $(\cdot)_1$ denotes the first vector component.



Distribution of the components 

We first remark from stochastic representation (13) that $X_1$ and $X_2$ are again distributed
according to a Student-t distribution with dimension $n' = 1$; moreover, the extraction
of components keeps the fluctuation variable $a$ unchanged, so that both $X_1$ and $X_2$ have
an unchanged number of degrees of freedom $m' = m = \frac{2}{q-1} - n$. Both thus have a new
nonextensivity parameter $q'$ that verifies

$$\frac{2}{q'-1} - 1 = \frac{2}{q-1} - n$$

or equivalently

$$q' = 1 + \frac{2(q-1)}{2 + (1-q)(n-1)}. \qquad (14)$$

Moreover, it is easy to check that their respective q-variances are $K_{11}$ and $K_{22}$, the first two
diagonal entries of the q-covariance matrix $K$. The three curves in Figure 2 represent $q'$ as a
function of $q$ for $n = 2$, 5 and 10 (from right to left).

Figure 2. $q'$ as a function of $q$ as in (14) for $n = 2$, 5 and 10 (right to left)



We note that

• $q = 1 \Rightarrow q' = 1$, since any component of a Gaussian vector is Gaussian

• the nonextensivity parameter $q'$ of a single component is larger than the nonextensivity
parameter $q$ of the system it is extracted from

• moreover, the larger the dimension $n$, the larger $q'$.
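These three observations can be verified directly from the closed form $q' = 1 + 2(q-1)/(2 + (1-q)(n-1))$ obtained from conservation of the number of degrees of freedom. The helper below is a minimal sketch with a hypothetical name; it only evaluates that expression.

```python
def component_index(q, n):
    """Index q' of a single component of an n-variate power-law vector with index q."""
    denom = 2.0 + (1.0 - q) * (n - 1)
    if denom <= 0.0:
        raise ValueError("q is outside the admissible range for this dimension")
    return 1.0 + 2.0 * (q - 1.0) / denom

# q' for q = 1.1 and the three dimensions used in Figure 2
curve_points = {n: component_index(1.1, n) for n in (2, 5, 10)}
```

For $q = 1.1$ the values increase with $n$ (about 1.105, 1.125 and 1.182 for $n = 2$, 5 and 10), and $q = 1$ is mapped to $q' = 1$ for every $n$.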

Distribution of the convolution 

The distribution of a linear combination $Z$ of $X_1$ and $X_2$ can be computed as

$$Z = \alpha X_1 + \beta X_2 \sim \frac{1}{a}\left(\alpha(\Lambda^{1/2}G)_1 + \beta(\Lambda^{1/2}G)_2\right) \sim \sqrt{m}\sqrt{\alpha^2K_{11} + \beta^2K_{22} + 2\alpha\beta K_{12}}\,\frac{G_1}{a}, \qquad (15)$$

so that $Z$ is again distributed as a Student-t distribution with the same parameter $m$ and
q-variance $\alpha^2K_{11} + \beta^2K_{22} + 2\alpha\beta K_{12}$. We underline the fact that stability under convolution
originates from the special type of dependence that exists between the components $X_1$ and
$X_2$, namely from the fact that they belong to the same (larger) system: in more physical terms,
$X_1$ and $X_2$ are components that have experienced the same random source of fluctuations.
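A simulation makes the role of the shared fluctuation variable concrete. In the sketch below, two correlated components are generated with the same chi variable $a$ (here $m = 3$, $n = 2$), and the combination $Z = \alpha X_1 + \beta X_2$ is compared with the prediction that $Z/\sqrt{\alpha^2K_{11} + \beta^2K_{22} + 2\alpha\beta K_{12}}$ is a classical Student-t variable with 3 degrees of freedom, whose distribution function is known in closed form. All numerical values, and the use of a Cholesky factor as the matrix square root, are assumptions of this illustration.

```python
import math
import random

def t3_cdf(x):
    """Closed-form CDF of the classical Student-t law with 3 degrees of freedom."""
    u = x / math.sqrt(3.0)
    return 0.5 + (u / (1.0 + u * u) + math.atan(u)) / math.pi

m = 3.0
K11, K22, K12 = 2.0, 1.0, 0.5            # illustrative q-covariance entries
alpha, beta = 0.7, -1.3                  # illustrative weights
scale = math.sqrt(alpha ** 2 * K11 + beta ** 2 * K22 + 2 * alpha * beta * K12)

# Cholesky factor of Lambda = m*K, used as one possible square root of Lambda
l11 = math.sqrt(m * K11)
l21 = m * K12 / l11
l22 = math.sqrt(m * K22 - l21 * l21)

rng = random.Random(2)
N, inside = 100_000, 0
for _ in range(N):
    g1, g2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    a = math.sqrt(rng.gammavariate(m / 2.0, 2.0))   # chi variable shared by both components
    x1 = l11 * g1 / a
    x2 = (l21 * g1 + l22 * g2) / a
    z = alpha * x1 + beta * x2
    inside += abs(z) <= scale
empirical = inside / N                    # estimate of P(|Z| <= scale)
predicted = 2.0 * t3_cdf(1.0) - 1.0       # P(|T_3| <= 1)
```

Drawing an independent `a` for each component instead would break the agreement, which is the point made in the text: stability comes from the common source of fluctuations.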



3.2. A second example: case q < 1 

We assume now that we extract two components $Y_1$ and $Y_2$ from a vector $Y \sim \mathcal{R}(p, n, K)$.
Then a stochastic representation of $Y_1$ and $Y_2$ is

$$Y_1 \sim \frac{(\Sigma^{1/2}G)_1}{\sqrt{G^TG + b^2}}, \qquad Y_2 \sim \frac{(\Sigma^{1/2}G)_2}{\sqrt{G^TG + b^2}}$$

so that $Y_1$ (resp. $Y_2$) follows a distribution $\mathcal{R}(p', 1, K')$ with $p' = p$, $K' = K_{11}$ (resp.
$K' = K_{22}$) and its new index of nonextensivity verifies

$$\frac{2}{1-q'} + 1 = \frac{2}{1-q} + n$$

or

$$q' = 1 - \frac{2(1-q)}{2 + (n-1)(1-q)}. \qquad (16)$$

We remark that (16) coincides with (14) since conservation of degrees of freedom $m$ in the
Student-t case and $p$ in the Student-r case is expressed by the same condition.
In Figure 3 below, $q'$ is represented as a function of $q$ for $n = 2$, 5 and 10 (bottom to top).
The same conclusions as in the case $q > 1$ hold, namely:

• if $q = 1$ then $q' = 1$ (Gaussian case)

• the nonextensivity parameter $q'$ of an extracted component is always larger than the
nonextensivity parameter of the original system; the larger the dimension $n$, the larger $q'$

The distribution of a linear combination can be evaluated as

$$Z = \alpha Y_1 + \beta Y_2 \sim \frac{\alpha(\Sigma^{1/2}G)_1 + \beta(\Sigma^{1/2}G)_2}{\sqrt{G^TG + b^2}} \sim \sqrt{p}\sqrt{\alpha^2K_{11} + \beta^2K_{22} + 2\alpha\beta K_{12}}\,\frac{G_1}{\sqrt{G^TG + b^2}}, \qquad (17)$$

so that $Z \sim \mathcal{R}(p, 1, \alpha^2K_{11} + \beta^2K_{22} + 2\alpha\beta K_{12})$.

Figure 3. $q'$ as a function of $q$ as in (16) for $n = 2$, 5 and 10 (bottom to top)



3.3. Orthogonal invariance 



These results can be generalized using the notion of elliptical distribution [13].

Definition 7 A distribution $f_X$ is elliptical (or elliptically invariant) if it writes

$$f_X(X) = \phi\left(X^TC_X^{-1}X\right)$$

for some positive definite matrix $C_X$ called the characteristic matrix of $f_X$ and some function
$\phi$ that may depend on $n$.

From (3) and (4), we check immediately that Student-t and Student-r distributions are
elliptically invariant. This special property can be justified as follows: up to application of
the mapping $X \to \Lambda^{-1/2}X$ or $Y \to \Sigma^{-1/2}Y$, it may be assumed in (3) and (4) that $\Lambda = I_n$
or $\Sigma = I_n$: this special case of elliptical invariance is called spherical invariance. An
equivalent definition of spherical invariance reads as follows: for all orthogonal matrices $O$,
the distribution of $OX$ coincides with the distribution of $X$:

$$f_{OX}(X) = f_X(X).$$

Now, the $H_q$ or $S_q$-entropy remains unchanged by orthogonal transformation since, for
example,

$$S_q(OX) = \frac{1}{q-1}\left(1 - \int f_{OX}^q(X)\,dX\right) = \frac{1}{q-1}\left(1 - \int f_X^q(O^TX)\,dX\right) = S_q(X) \qquad (18)$$

where we have used the fact that for any orthogonal matrix, $|O| = 1$. Moreover, the constraints
under which the $S_q$-entropy is maximized, that is, the normalization constraint and the
q-covariance constraint of Theorem 1 with $K = I_n$,
are themselves spherically invariant as well. Thus, it is not surprising that the maximizer of $S_q$
under these constraints is spherically invariant, and elliptically invariant in the more general
case $C_X \neq I_n$.
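Spherical invariance is easy to observe numerically. The sketch below evaluates a two-dimensional Student-t density of the power-law form $A_q(1 + X^TX)^{1/(1-q)}$ (that is, with characteristic matrix $I_2$) at a point and at several rotated copies of that point; the values must coincide. The normalization constant used here, $A_q = \Gamma(\frac{1}{q-1})/(\Gamma(\frac{1}{q-1} - 1)\,\pi)$, is the standard one for $n = 2$; the function names and numerical values are assumptions of this illustration.

```python
import math

def student_t_density_2d(x, q):
    """Spherical power-law density A_q * (1 + |x|^2)^(1/(1-q)) in dimension n = 2."""
    n = 2
    e = 1.0 / (q - 1.0)
    norm = math.gamma(e) / (math.gamma(e - n / 2.0) * math.pi ** (n / 2.0))
    return norm * (1.0 + x[0] ** 2 + x[1] ** 2) ** (1.0 / (1.0 - q))

def rotate(x, theta):
    """Apply the rotation of angle theta (an orthogonal matrix O) to the point x."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * x[0] - s * x[1], s * x[0] + c * x[1])

q = 1.4                                   # must satisfy 1 < q < (n+2)/n = 2
x = (0.8, -1.7)
values = [student_t_density_2d(rotate(x, k * 0.37), q) for k in range(5)]
```

Replacing $I_2$ by a general characteristic matrix would make the level sets ellipses, and invariance would then hold only under the transformations that preserve $X^TC_X^{-1}X$.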




Elliptical invariance of distributions of the power type: the stability and extensivity issues 10 



3.4. Properties of elliptical distributions and consequences 

The stability property exposed in parts 3.1 and 3.2 appears as a particular case of the more 
general property of elliptical distributions that we cast here as follows: 

Theorem 8 [13] If $X$ is distributed according to an elliptical distribution

$$f_X(X) = \phi\left(X^TC_X^{-1}X\right)$$

and if $A$ is a $(p \times n)$ full-rank matrix with $p \le n$ then $\tilde{X} = AX$ is again elliptically invariant
with characteristic matrix

$$C_{\tilde{X}} = AC_XA^T. \qquad (19)$$

As a consequence, one can characterize the precise way in which power-law random vectors 
behave under linear transformation as follows. 

Case of components' extraction 

Suppose we extract the $k < n$ first components $X' = (X_1, \ldots, X_k)$ from a vector of the
power-law type $X \sim \mathcal{T}(m, n, K)$. This process corresponds to applying the matrix

$$A = \left[I_{k\times k}\;\; 0_{k\times(n-k)}\right]$$

to vector $X$, and we conclude that $X' = AX \sim \mathcal{T}(m', k, K')$ where $K' = AKA^T$ coincides
with the principal $(k \times k)$ block of $K$ and $m' = m$, corresponding to a new index of
nonextensivity

$$q' = 1 + \frac{2(q-1)}{2 + (1-q)(n-k)} > 1. \qquad (20)$$

For a power-law vector $Y \sim \mathcal{R}(p, n, K)$, as remarked in part 3.2, conservation of the number
$p$ of degrees of freedom yields the same condition as conservation of the number $m$ of degrees
of freedom in the Student-t case, that is

$$\frac{2}{1-q'} + k = \frac{2}{1-q} + n$$

or

$$q' = 1 - \frac{2(1-q)}{2 + (1-q)(n-k)}. \qquad (21)$$

In both (20) and (21), $q = 1 \Rightarrow q' = 1$, yielding the classical property of Boltzmann
systems, any subsystem of which is still of the Boltzmann type.

Case of convolution 

Choosing $A = [a_1, a_2, \ldots, a_n]$ in (19) yields the following results:

$$X \sim \mathcal{T}(m, n, K) \;\Rightarrow\; \sum_{i=1}^{n} a_iX_i \sim \mathcal{T}(m, 1, AKA^T)$$

$$Y \sim \mathcal{R}(p, n, K) \;\Rightarrow\; \sum_{i=1}^{n} a_iY_i \sim \mathcal{R}(p, 1, AKA^T)$$

We note again that this stability result requires a special type of dependence between the
components $\{X_i\}$ or $\{Y_i\}$, namely the fact that they are extracted from the same system.
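Both cases reduce to computing $AKA^T$ for a suitable matrix $A$ and, for extraction, to updating the nonextensivity index. The sketch below does this with plain nested lists; the helper names and the example matrix are hypothetical, and the index update uses the expression $q' = 1 + 2(q-1)/(2 + (1-q)(n-k))$, which covers both the $q > 1$ and $q < 1$ cases.

```python
def transform_q_covariance(A, K):
    """Return A K A^T for list-of-lists matrices A (p x n) and K (n x n)."""
    n = len(K)
    AK = [[sum(row[j] * K[j][c] for j in range(n)) for c in range(n)] for row in A]
    return [[sum(AK[r][j] * A[s][j] for j in range(n)) for s in range(len(A))]
            for r in range(len(A))]

def extracted_index(q, n, k):
    """Index q' of k components extracted from an n-variate power-law vector of index q."""
    return 1.0 + 2.0 * (q - 1.0) / (2.0 + (1.0 - q) * (n - k))

K = [[2.0, 0.5, 0.1],
     [0.5, 1.0, 0.3],
     [0.1, 0.3, 4.0]]                     # illustrative positive definite q-covariance
A_extract = [[1.0, 0.0, 0.0],
             [0.0, 1.0, 0.0]]             # keeps the first k = 2 components
A_sum = [[0.7, -1.3, 0.0]]                # one linear combination of the components
K_block = transform_q_covariance(A_extract, K)
K_sum = transform_q_covariance(A_sum, K)
```

`K_block` is the principal 2 x 2 block of `K`, and `K_sum` is the 1 x 1 matrix holding the q-variance of the linear combination.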
4. The stability issue for independent vectors

Few results exist about the convolution of two independent Student-t or Student-r vectors. In
Ref. [16], Oliveira et al. remark that if $X_1$ and $X_2$ are independent and $\mathcal{T}(m, 1, \sigma)$ distributed,
their sum

$$Z = X_1 + X_2$$

can be very accurately approximated as a $\mathcal{T}(m', 1, \sigma')$ for some $(m', \sigma')$ depending on $(m, \sigma)$.
However, they provide only an approximation to the map $(m, \sigma) \mapsto (m', \sigma')$.
An important result can be stated when $q > 1$, in the special case for which the number $m$ of
degrees of freedom is an odd integer $m = 2l + 1$.

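Theorem 9 [8] If $X_1$ and $X_2$ are two independent vectors following a distribution
$\mathcal{T}\left(2l+1, n, \frac{1}{2l+1}I_n\right)$ and if $\alpha$ is such that $0 < \alpha < 1$ and $\beta = 1 - \alpha$, then the distribution
of

$$Z = \alpha X_1 + \beta X_2$$

can be expressed as

$$f_Z(Z) = \sum_{k=0}^{l}\gamma_k^{(2l+1)}(\alpha)\,\mathcal{T}\left(2l+2k+1, n, \frac{1}{2l+2k+1}I_n\right) \qquad (22)$$

with

$$\gamma_k^{(2l+1)}(\alpha) = \left(4\alpha(1-\alpha)\right)^k\left(\frac{l!}{(2l)!}\right)^2 2^{-2l}\,\frac{(2l-2k)!\,(2l+2k)!}{(l-k)!\,(l+k)!}\sum_{j=0}^{l-k}\binom{2l+1}{2j}\binom{l-j}{k}(2\alpha-1)^{2j}.$$

Since coefficients $\gamma_k^{(2l+1)}$ are positive and sum to 1 (see [8] for a proof), this result can be
interpreted as follows: the convolution of $\mathcal{T}$ distributions with odd degrees of freedom follows
a $\mathcal{T}$ distribution whose degrees of freedom are randomized:

$$f_Z(Z) = \mathcal{T}\left(2l+2K+1, n, \frac{1}{2l+2K+1}I_n\right) \qquad (23)$$

where $K$ is a random variable defined as

$$\Pr\{K = k\} = \gamma_k^{(2l+1)}(\alpha), \qquad 0 \le k \le l.$$

As an example, if $n = 1$ and $m = 3 \Rightarrow l = 1$, we have

$$f_Z(z) = \gamma_0^{(3)}\,\mathcal{T}(3, 1, 1/3) + \gamma_1^{(3)}\,\mathcal{T}(5, 1, 1/5)$$

with

$$\gamma_0^{(3)} = 1 - 3\alpha(1-\alpha), \qquad \gamma_1^{(3)} = 3\alpha(1-\alpha).$$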
We note that conditions $0 < \alpha < 1$ and $\alpha + \beta = 1$ are not restrictive since

• if $\alpha < 0$, then by parity of $\mathcal{T}(m, n, K)$, $\alpha X_1 \sim (-\alpha)X_1$

• if $\alpha + \beta \neq 1$, then $\alpha X_1 + \beta X_2 \sim (\alpha+\beta)\left(\frac{\alpha}{\alpha+\beta}X_1 + \frac{\beta}{\alpha+\beta}X_2\right)$.

An important result is the following one: formula (22) can be extended to the case where
$X_1 \sim \mathcal{T}(m, n, K)$ and $X_2 \sim \mathcal{T}(m, n, K)$, provided $X_1$ and $X_2$ have the same q-covariance
matrix $K$: in that case, $K^{-1/2}X_1$ and $K^{-1/2}X_2$ have identity q-covariance, so that the
distribution of $K^{-1/2}(X_1 + X_2)$ can be computed using formula (22) and the distribution of
$X_1 + X_2$ can be obtained by a simple change of variable.
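The mixture weights of Theorem 9 can be tabulated exactly with integer arithmetic. The sketch below assumes the coefficients take the form $\gamma_k^{(2l+1)}(\alpha) = (4\alpha(1-\alpha))^k \left(\frac{l!}{(2l)!}\right)^2 2^{-2l} \frac{(2l-2k)!(2l+2k)!}{(l-k)!(l+k)!} \sum_{j=0}^{l-k}\binom{2l+1}{2j}\binom{l-j}{k}(2\alpha-1)^{2j}$; this reading reproduces the example $\gamma_0^{(3)} = 1 - 3\alpha(1-\alpha)$, $\gamma_1^{(3)} = 3\alpha(1-\alpha)$ and sums to one in the cases checked, but it should be compared against [8] before being relied upon. The function name is hypothetical.

```python
from math import comb, factorial

def gamma_coefficients(l, alpha):
    """Mixture weights gamma_k^{(2l+1)}(alpha) for k = 0..l, under the form stated above."""
    prefactor = (factorial(l) / factorial(2 * l)) ** 2 * 2.0 ** (-2 * l)
    weights = []
    for k in range(l + 1):
        ratio = (factorial(2 * l - 2 * k) * factorial(2 * l + 2 * k)
                 / (factorial(l - k) * factorial(l + k)))
        inner = sum(comb(2 * l + 1, 2 * j) * comb(l - j, k) * (2 * alpha - 1) ** (2 * j)
                    for j in range(l - k + 1))
        weights.append((4 * alpha * (1 - alpha)) ** k * prefactor * ratio * inner)
    return weights

weights_m3 = gamma_coefficients(1, 0.3)   # m = 3 (l = 1), alpha = 0.3
weights_m5 = gamma_coefficients(2, 0.3)   # m = 5 (l = 2), alpha = 0.3
```

For $m = 3$ and $\alpha = 0.3$ the weights are 0.37 and 0.63; most of the mass moves to the highest number of degrees of freedom as $\alpha$ approaches 1/2.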



5. Another approach to the stability issue: random convolution 

A radically different approach to the problem discussed here, namely the conditions
of stability for power-type distributions, can be followed in the case $q < 1$ by considering the
polar factorization property of the stochastic representation (10).

Theorem 10 If $Y$ has stochastic representation

$$Y = \frac{G}{\sqrt{G^TG + b^2}}$$

where $G$ is a Gaussian vector with unit covariance matrix and $b$ is $\chi$ distributed, independent
of $G$, then $Y$ is independent of $\sqrt{G^TG + b^2}$; we remark that the latter is chi distributed with
$p + 2 = \frac{2}{1-q} + n + 2$ degrees of freedom.

An important consequence of this property is that it allows us to derive a new kind of convolution
of random type, as expressed by the next theorem [9].

Theorem 11 If $Y_1 \sim \mathcal{R}(p, n, K_1)$ and $Y_2 \sim \mathcal{R}(p, n, K_2)$ are two independent vectors, if $\alpha_1$
and $\alpha_2$ are two real scalars and $a_1$ and $a_2$ are two independent chi random variables with
$d = p + 2 = \frac{2}{1-q} + n + 2$ degrees of freedom, and if $\beta_1 = \alpha_1\frac{a_1}{\sqrt{p+2}}$, $\beta_2 = \alpha_2\frac{a_2}{\sqrt{p+2}}$, then vector

$$Y = \beta_1Y_1 + \beta_2Y_2$$

is Gaussian with q-covariance matrix $R = \frac{p}{p+2}\left(\alpha_1^2K_1 + \alpha_2^2K_2\right)$. Moreover, if $c$ is chi
distributed with $p - n + 2$ degrees of freedom and independent of $Y$, then

$$\frac{Y}{\sqrt{Y^TR^{-1}Y + c^2}} = \frac{\beta_1Y_1 + \beta_2Y_2}{\sqrt{(\beta_1Y_1 + \beta_2Y_2)^TR^{-1}(\beta_1Y_1 + \beta_2Y_2) + c^2}}$$

is again $\mathcal{R}(p, n, K)$ distributed with q-covariance matrix $K = \frac{1}{p+2}\left(\alpha_1^2K_1 + \alpha_2^2K_2\right)$.

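In Figure 4 below, the distribution of $\beta_1 = \alpha_1\frac{a_1}{\sqrt{p+2}}$ is represented for $p + 2 = 10$, 20 and 50,
and $\alpha_1 = 2$.

It is clearly seen that $\beta_1$ is a "fluctuating" version of the deterministic value $\alpha_1 = 2$; since

$$E\beta_1 = \alpha_1\sqrt{\frac{2}{p+2}}\,\frac{\Gamma\left(\frac{p+3}{2}\right)}{\Gamma\left(\frac{p+2}{2}\right)} \qquad (24)$$

we have $\lim_{p\to+\infty}E\beta_1 = \alpha_1$; moreover, the variance of $\beta_1$ is

$$\mathrm{var}(\beta_1) = \alpha_1^2\left(1 - \frac{2}{p+2}\,\frac{\Gamma^2\left(\frac{p+3}{2}\right)}{\Gamma^2\left(\frac{p+2}{2}\right)}\right) \qquad (25)$$

so that $\lim_{p\to+\infty}\mathrm{var}\,\beta_1 = 0$: thus the number of degrees of freedom $p + 2$, imposed by the
value of $q$ that characterizes $Y_1$ and $Y_2$ through $p = n + \frac{2}{1-q}$, rules the fluctuation intensity of
$\beta_1$ around the deterministic value $\alpha_1$.

Figure 4. Distribution of $\beta_1 = \alpha_1\frac{a_1}{\sqrt{p+2}}$ for $p + 2 = 10$, 20 and 50, and $\alpha_1 = 2$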
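The concentration of the random weight around its deterministic value can be quantified from the moments of the chi distribution. The sketch below assumes the normalization $\beta_1 = \alpha_1 a_1/\sqrt{d}$ with $a_1$ chi distributed with $d = p + 2$ degrees of freedom, so that $E a_1 = \sqrt{2}\,\Gamma(\frac{d+1}{2})/\Gamma(\frac{d}{2})$ and $E a_1^2 = d$; the function name is hypothetical.

```python
import math

def beta_moments(alpha1, d):
    """Mean and variance of beta_1 = alpha1 * a1 / sqrt(d), a1 being chi with d degrees of freedom."""
    # E[a1] = sqrt(2) * Gamma((d+1)/2) / Gamma(d/2), evaluated through log-gammas for stability
    mean_chi = math.sqrt(2.0) * math.exp(math.lgamma((d + 1) / 2.0) - math.lgamma(d / 2.0))
    mean = alpha1 * mean_chi / math.sqrt(d)
    var = alpha1 ** 2 * (1.0 - mean_chi ** 2 / d)   # uses E[a1^2] = d
    return mean, var

# the three cases of Figure 4: p + 2 = 10, 20 and 50 with alpha_1 = 2
moments = {d: beta_moments(2.0, d) for d in (10, 20, 50)}
```

The mean approaches $\alpha_1 = 2$ from below and the variance decreases roughly like $\alpha_1^2/(2d)$ as $d$ grows.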

6. The Extensivity Issue 

Still a different and important question was raised in [7], namely the extensivity issue:
assuming that a system $A$ is composed of two independent subsystems $A_1$ and $A_2$, the total
q-entropy

$$S_q(A) = S_q(A_1 \times A_2) = S_q(A_1) + S_q(A_2) + (1-q)S_q(A_1)S_q(A_2)$$

is nonextensive (i.e. nonadditive) unless $q = 1$, which characterizes the Shannon entropy ||.
A natural question then arises: what kind of dependence should exist between subsystems $A_1$
and $A_2$ so that $S_q$ becomes extensive?
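The pseudo-additivity rule follows from the factorization of $\int f^q$ over independent subsystems. As a minimal sketch, the code below checks it on the discrete analogue $S_q = (1 - \sum_i p_i^q)/(q-1)$ of the Tsallis entropy; the discrete setting and the probability vectors are assumptions chosen to keep the example exact and short.

```python
def tsallis_entropy(probs, q):
    """Discrete analogue of the Tsallis entropy: S_q = (1 - sum_i p_i^q) / (q - 1)."""
    return (1.0 - sum(p ** q for p in probs)) / (q - 1.0)

q = 0.7
p1 = [0.2, 0.3, 0.5]
p2 = [0.6, 0.4]
joint = [a * b for a in p1 for b in p2]   # independent subsystems: product distribution
s1 = tsallis_entropy(p1, q)
s2 = tsallis_entropy(p2, q)
s12 = tsallis_entropy(joint, q)
pseudo_additive = s1 + s2 + (1.0 - q) * s1 * s2
```

Here `s12` matches `pseudo_additive` to rounding error, while it differs from the plain sum `s1 + s2` by the cross term $(1-q)S_q(A_1)S_q(A_2)$.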

An answer has been given to this question in the case of Gaussian systems, as follows.

Theorem 12 If $0 < Q < 1$ and $n \in \mathbb{N}$ then there exists a positive definite matrix $K$ and an
n-variate Gaussian vector $X$ with covariance matrix $K$ such that $X$ verifies the extensivity
condition

$$S_Q(X) = \sum_{i=1}^{n}S_Q(X_i). \qquad (26)$$

Trying to extend this result to the distributions (3) and (4), one should be careful about
the following fact: if $X$ is an n-variate random vector with probability density (3) or (4) and
nonextensivity parameter $q$ then any single component, say $X_1$, of $X$ is again of the power
type, but with a different nonextensivity parameter, say $q_1$, related to $q$ via (20) or equivalently
(21):

$$q_1 = 1 + \frac{2(q-1)}{2 - (q-1)(n-1)}.$$

|| this paragraph concerns only the Tsallis entropy $S_q$ since the Havrda-Charvát-Rényi entropy $H_q$ is extensive
Thus, the choice of $Q$ as related to $q$ and $q_1$ should be decided. The choice $Q = 2 - q$ has
a long history in the nonextensive literature and already appeared in earlier work; for a
thorough discussion of the issue and its physical interpretation see [19]. This choice yields
the following result.

Theorem 13 $\forall m > 1$ and $n \in \mathbb{N}$, there exists a positive definite matrix $K$ and an n-variate
Student-t vector $X$ with $m$ degrees of freedom and q-covariance matrix $K$ such that

$$H_{Q_n}(X) = \sum_{i=1}^{n}H_{Q_1}(X_i)$$

with $q_1 = 1 + \frac{2(q-1)}{2+(1-q)(n-1)}$, $Q_n = 2 - q$ and $Q_1 = 2 - q_1$.
This result can be extended to the case $q < 1$ as follows.

Theorem 14 $\forall p > 1$ and $n \in \mathbb{N}$, there exists a positive definite matrix $K$ and an n-variate
Student-r vector $Y$ with $p$ degrees of freedom and q-covariance matrix $K$ such that

$$H_{Q_n}(Y) = \sum_{i=1}^{n}H_{Q_1}(Y_i)$$

with $q_1 = 1 - \frac{2(1-q)}{2+(n-1)(1-q)}$, $Q_n = 2 - q$ and $Q_1 = 2 - q_1$.
7. Conclusion 

In this communication we have presented several results concerning (i) the stability and (ii)
the extensivity of power-law random vectors. We have shown that a certain kind of dependence
between the components of these vectors, namely the fact that they belong to a larger system
that is itself distributed à la power law, ensures stability of these variables. This property is a
direct consequence of the elliptical invariance of the associated $S_q$ or $H_q$ entropy.
In the case of independent components, we have introduced a random-type convolution that
ensures stability for the power-law distributions.

Finally, we have shown that $S_q$ can be additive if a proper kind of correlation is introduced
between the components of the pertinent system, whose properties are to be described by
power-law vectors. Further work in progress concerns the extension of this last result to the
larger family of elliptically invariant distributions.